Magnetic resonance imaging (MRI) scans are time-consuming and uncomfortable, since patients must remain still in a confined space for long periods. To reduce scan time, researchers have experimented with undersampling k-space and using deep learning to predict the fully sampled result, reporting savings of 20 to 30 minutes on scans that would otherwise take an hour or more. However, none of these studies has explored the possibility of using masked image modeling (MIM) to predict the missing portions of MRI k-space. This study used 11,161 reconstructed MRI and k-space images of knee MRIs from Facebook's fastMRI dataset. It tested modified versions of existing models, based on the baseline shifted-window (Swin) and Vision Transformer architectures, that apply MIM to undersampled k-space to predict the full k-space and, from it, the full MRI image. The modifications were made using the PyTorch and NumPy libraries and released to a GitHub repository. After the model reconstructed the k-space, a basic Fourier transform was applied to obtain the actual MRI image. Once the model reached a steady state, experimentation with hyperparameters helped achieve pinpoint accuracy in the reconstructed images. The model was evaluated using L1 loss, gradient normalization, and structural similarity (SSIM). It produced reconstructed images with an average L1 loss below 0.01 and a gradient normalization value below 0.1 after training completed. The reconstructed k-space achieved SSIM values above 99% for both training and validation against the fully sampled k-space, while validation loss continually decreased below 0.01. These results strongly support the idea that the algorithm can be used for MRI reconstruction, as they show that the model's reconstructed images agree closely with the original, fully sampled k-space.
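The final reconstruction step the abstract describes, an inverse Fourier transform applied to (predicted) k-space, can be sketched with NumPy. The phantom, the undersampling mask, and the function names below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def kspace_to_image(kspace: np.ndarray) -> np.ndarray:
    """Recover a magnitude image from centered 2-D k-space (DC in the middle)."""
    return np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))

def undersample(kspace: np.ndarray, keep_every: int = 4) -> np.ndarray:
    """Zero all but every `keep_every`-th row, mimicking an undersampled scan."""
    masked = np.zeros_like(kspace)
    masked[::keep_every] = kspace[::keep_every]
    return masked

# Toy 64x64 phantom: image -> k-space -> image round trip.
phantom = np.zeros((64, 64))
phantom[24:40, 24:40] = 1.0
kspace = np.fft.fftshift(np.fft.fft2(phantom))

recon_full = kspace_to_image(kspace)                 # matches the phantom
recon_under = kspace_to_image(undersample(kspace))   # aliased and attenuated
```

The degraded `recon_under` is what a model in this setting would be asked to repair, by filling in the zeroed k-space rows before the inverse transform.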
Unlike tabular data, features in network data are interconnected within a domain-specific graph. Examples of this setting include gene expression overlaid on a protein interaction network (PPI) and user opinions in a social network. Network data is typically high-dimensional (large number of nodes) and often contains outlier snapshot instances and noise. In addition, it is often non-trivial and time-consuming to annotate instances with global labels (e.g., disease or normal). How can we jointly select discriminative subnetworks and representative instances for network data without supervision? We address these challenges within an unsupervised framework for joint subnetwork and instance selection in network data, called UISS, via a convex self-representation objective. Given an unlabeled network dataset, UISS identifies representative instances while ignoring outliers. It outperforms state-of-the-art baselines on both discriminative subnetwork selection and representative instance selection, achieving up to 10% accuracy improvement on all real-world data sets we use for evaluation. When employed for exploratory analysis in RNA-seq network samples from multiple studies, it produces interpretable and informative summaries.
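A rough sketch of self-representation-based instance selection, heavily simplified: a plain ridge penalty stands in for UISS's structured convex regularizers, and all names and data are invented for illustration:

```python
import numpy as np

def representative_scores(X: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Solve min_W ||X - W X||_F^2 + lam ||W||_F^2 in closed form, then rank
    instances by how heavily the others rely on them for reconstruction."""
    G = X @ X.T                                        # Gram matrix of instances
    W = np.linalg.solve(G + lam * np.eye(len(X)), G)   # W = G (G + lam I)^-1
    np.fill_diagonal(W, 0.0)                           # ignore self-reconstruction
    return np.linalg.norm(W, axis=0)                   # column norm per instance

# Five near-duplicate instances plus one outlier: the outlier scores lowest.
rng = np.random.default_rng(4)
X = np.stack([np.ones(8) + 0.01 * rng.normal(size=8) for _ in range(5)]
             + [2.0 * np.tile([1.0, -1.0], 4)])
scores = representative_scores(X)
```

Instances with high scores reconstruct many others and act as representatives; an outlier that nothing else uses for reconstruction gets a near-zero score and can be ignored, mirroring the selection behavior described above.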
Determining the optimal sample complexity of PAC learning in the realizable setting was a central open problem in learning theory for decades. Finally, the seminal work by Hanneke (2016) gave an algorithm with a provably optimal sample complexity. His algorithm is based on a careful and structured sub-sampling of the training data and then returning a majority vote among hypotheses trained on each of the sub-samples. While a very exciting theoretical result, it has not had much impact in practice, in part due to inefficiency, since it constructs a polynomial number of sub-samples of the training data, each of linear size. In this work, we prove the surprising result that the practical and classic heuristic bagging (a.k.a. bootstrap aggregation), due to Breiman (1996), is in fact also an optimal PAC learner. Bagging pre-dates Hanneke's algorithm by twenty years and is taught in most undergraduate machine learning courses. Moreover, we show that it only requires a logarithmic number of sub-samples to reach optimality.
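The bagging procedure the paper analyzes is just a bootstrap-and-vote loop, sketched here on toy 1-D data with a decision-stump base learner of my choosing (the base learner and dataset are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_stump(X, y):
    """Base learner: the 1-D threshold classifier minimizing training error."""
    best_err, best_t, best_sign = np.inf, 0.0, 1
    for t in np.unique(X[:, 0]):
        for sign in (1, -1):
            err = np.mean(np.where(X[:, 0] > t, sign, -sign) != y)
            if err < best_err:
                best_err, best_t, best_sign = err, t, sign
    t, s = best_t, best_sign
    return lambda Z: np.where(Z[:, 0] > t, s, -s)

def bagging(X, y, n_estimators=25):
    """Breiman-style bagging: bootstrap sub-samples plus a majority vote."""
    n = len(X)
    models = [fit_stump(X[idx], y[idx])
              for idx in (rng.integers(0, n, size=n) for _ in range(n_estimators))]
    def predict(Z):
        votes = np.sum([m(Z) for m in models], axis=0)
        return np.where(votes >= 0, 1, -1)
    return predict

# Separable toy data: label +1 iff x > 0.5.
X = rng.uniform(0, 1, size=(200, 1))
y = np.where(X[:, 0] > 0.5, 1, -1)
predict = bagging(X, y)
```

Note the contrast the abstract draws: here each of the 25 sub-samples is a uniform bootstrap of the full training set, rather than the polynomially many structured sub-samples of Hanneke's construction.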
Safety is still one of the major research challenges in reinforcement learning (RL). In this paper, we address the problem of how to avoid safety violations of RL agents during exploration in probabilistic and partially unknown environments. Our approach combines automata learning for Markov Decision Processes (MDPs) and shield synthesis in an iterative approach. Initially, the MDP representing the environment is unknown. The agent starts exploring the environment and collects traces. From the collected traces, we passively learn MDPs that abstractly represent the safety-relevant aspects of the environment. Given a learned MDP and a safety specification, we construct a shield. For each state-action pair within a learned MDP, the shield computes the exact probability that executing the action results in violating the specification from the current state within the next $k$ steps. After the shield is constructed, the shield is used during runtime and blocks any actions that induce too large a risk from the agent. The shielded agent continues to explore the environment and collects new data on the environment. Iteratively, we use the collected data to learn new MDPs with higher accuracy, resulting in turn in shields able to prevent more safety violations. We implemented our approach and present a detailed case study of a Q-learning agent exploring slippery Gridworlds. In our experiments, we show that as the agent explores more and more of the environment during training, the improved learned models lead to shields that are able to prevent many safety violations.
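The shield's per-state-action risk can be read as a backward recursion over the learned MDP. The sketch below is one illustrative reading of that $k$-step violation probability on a tiny invented MDP, not the authors' implementation:

```python
import numpy as np

def violation_prob(P: np.ndarray, unsafe: np.ndarray, k: int) -> np.ndarray:
    """q[s, a]: probability of violating the specification within k steps after
    taking action a in state s, assuming the safest action is chosen at every
    later step. P has shape (states, actions, states)."""
    v = unsafe.astype(float)      # v[s]: violation probability with 0 steps left
    q = P @ v                     # q[s, a] = sum_s' P[s, a, s'] * v[s']
    for _ in range(k - 1):
        v = np.where(unsafe, 1.0, q.min(axis=1))  # follow the safest action
        q = P @ v
    return q

# Toy 3-state MDP: 0 = safe, 1 = risky, 2 = violating (absorbing).
P = np.zeros((3, 2, 3))
P[0, 0, 0] = 1.0              # action 0 in state 0: stay safe
P[0, 1, 1] = 1.0              # action 1 in state 0: move to the risky state
P[1, 0, 0] = 1.0              # action 0 in state 1: retreat to safety
P[1, 1, 0] = 0.5              # action 1 in state 1: coin flip...
P[1, 1, 2] = 0.5              # ...between safe and violating
P[2, :, 2] = 1.0              # violations are absorbing
unsafe = np.array([False, False, True])

q = violation_prob(P, unsafe, k=3)
# A shield would now block any action with q[s, a] above its risk threshold.
```

Here the risky action in state 1 carries probability 0.5 of violating within the horizon, so a shield with any threshold below 0.5 would block it while leaving the retreat action available.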
The classical algorithm AdaBoost allows one to convert a weak learner, i.e. an algorithm producing hypotheses slightly better than chance, into a strong learner achieving arbitrarily high accuracy when given enough training data. We present a new algorithm that constructs a strong learner from a weak learner, but uses less training data than AdaBoost and all other weak-to-strong learners to achieve the same generalization bounds. A sample-complexity lower bound shows that our new algorithm uses the minimum possible amount of training data and is thus optimal. Hence, this work settles the sample complexity of the classical problem of constructing a strong learner from a weak learner.
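For reference, the classical AdaBoost procedure the abstract starts from can be sketched in a few lines. The toy dataset and decision-stump weak learner are my own illustrative choices, and the paper's new optimal algorithm is not shown here:

```python
import numpy as np

def fit_weighted_stump(X, y, w):
    """Weak learner: 1-D threshold classifier minimizing *weighted* error."""
    best = (np.inf, 0.0, 1)
    for t in np.unique(X[:, 0]):
        for sign in (1, -1):
            err = np.sum(w * (np.where(X[:, 0] > t, sign, -sign) != y))
            if err < best[0]:
                best = (err, t, sign)
    err, t, s = best
    return err, (lambda Z: np.where(Z[:, 0] > t, s, -s))

def adaboost(X, y, rounds=3):
    n = len(X)
    w = np.full(n, 1.0 / n)              # uniform initial weights
    hs, alphas = [], []
    for _ in range(rounds):
        err, h = fit_weighted_stump(X, y, w)
        err = max(err, 1e-12)            # guard against a perfect weak hypothesis
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * h(X))   # upweight the mistakes
        w /= w.sum()
        hs.append(h); alphas.append(alpha)
    def predict(Z):
        score = sum(a * h(Z) for a, h in zip(alphas, hs))
        return np.where(score >= 0, 1, -1)
    return predict

# Toy 1-D target no single stump can express: +1 on a middle interval.
X = np.array([[0.1], [0.2], [0.4], [0.5], [0.6], [0.8], [0.9]])
y = np.array([-1, -1, 1, 1, 1, -1, -1])
predict = adaboost(X, y, rounds=3)       # three rounds suffice on this data
```

This is exactly the weak-to-strong conversion being discussed: each stump is only slightly better than chance on its reweighted distribution, yet their weighted vote classifies the interval pattern perfectly.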
We propose a methodology at the nexus of operations research and machine learning (ML) that leverages the generic approximators provided by ML to accelerate the solution of mixed-integer linear two-stage stochastic programs. We aim at solving problems where the second stage is highly demanding. Our core idea is to gain large reductions in online solution time, while incurring only a small reduction in first-stage solution accuracy, by replacing the exact second-stage solutions with fast, yet accurate, supervised ML predictions. The upfront investment in ML is justified when similar problems are solved repeatedly over time, as in transportation planning related to fleet management, routing, and container yard management. Our numerical results focus on problem classes solved with the integer and continuous L-shaped cuts. Our extensive empirical analysis is grounded in standardized families of stochastic server location (SSLP) and stochastic multi-knapsack (SMKP) problems. The proposed method can solve the hardest instances of SSLP in less than 9% of the time, while for SMKP the same figure is 20%. Average optimality gaps are below 0.1% in most cases.
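The core idea, replacing an exact second-stage solve with a fast supervised prediction, can be illustrated with a toy recourse function and a least-squares surrogate. Everything here, including the cost function, features, and names, is invented for illustration and is not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(1)

def exact_second_stage(x, xi):
    """Stand-in for solving the second-stage problem to optimality:
    recourse cost for unmet demand plus a holding cost on the decision x."""
    return np.maximum(xi - x, 0.0) * 3.0 + 0.5 * x

# Offline phase: generate training pairs from "solved" instances.
xs = rng.uniform(0, 10, 500)      # first-stage decisions
xis = rng.uniform(0, 10, 500)     # scenario realizations
costs = exact_second_stage(xs, xis)

# Fit a surrogate on simple hand-picked features of (decision, scenario).
F = np.column_stack([np.ones_like(xs), xs, xis, np.maximum(xis - xs, 0.0)])
theta, *_ = np.linalg.lstsq(F, costs, rcond=None)

def surrogate(x, xi):
    """Online phase: predict the second-stage cost instead of solving it."""
    return np.array([1.0, x, xi, max(xi - x, 0.0)]) @ theta
```

In the online phase only the cheap `surrogate` is evaluated inside the first-stage optimization, which is where the large solution-time reductions described above come from; the upfront cost is generating and fitting the training pairs.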
Autonomous systems are becoming ubiquitous and gaining momentum within the marine sector. Since the electrification of transport is happening simultaneously, autonomous marine vessels can reduce environmental impact, lower costs, and increase efficiency. Although close monitoring is still required to ensure safety, the ultimate goal is full autonomy. One major milestone is to develop a control system that is robust and reliable enough to handle any weather and encounter. Furthermore, the control system must adhere to the International Regulations for Preventing Collisions at Sea (COLREGs) in order to interact successfully with human sailors. Since the COLREGs were written to be interpreted by a human mind, they are expressed in ambiguous prose and are therefore not machine-readable or verifiable. Because of these challenges, and the wide variety of situations to be accounted for, classical model-based approaches prove complex to implement and computationally heavy. Within machine learning (ML), deep reinforcement learning (DRL) has shown great potential for a wide range of applications. The model-free and self-learning properties of DRL make it a promising candidate for autonomous vessels. In this work, a subset of the COLREGs is incorporated into a DRL-based path following and obstacle avoidance system using collision risk theory. The resulting autonomous agent dynamically interpolates between path following and COLREG-compliant collision avoidance in the training scenario, in isolated encounter situations, and in AIS-based simulations of real-world scenarios.
This work explores the use of machine learning techniques on an Internet-of-Things (IoT) firmware dataset to detect malicious attempts to infect edge devices or to subsequently corrupt an entire network. Firmware updates are rare in IoT devices; hence, these devices abound with vulnerabilities. Attacks against such devices can go unnoticed, and users can become a weak point in security. Malware can cause DDoS attacks and even spy on sensitive areas such as people's homes. To help mitigate this threat, this paper employs a number of machine learning algorithms to classify IoT firmware, and the best-performing models are reported. In a general comparison, the top three algorithms are the gradient boosting, logistic regression, and random forest classifiers. Deep learning approaches, including convolutional and fully connected neural networks with both experimental and proven successful architectures, are also explored.
Cybersecurity can be enhanced through the application of machine learning by recasting network attack data into an image format and then applying supervised computer vision and other machine learning techniques to detect malicious specimens. Exploratory data analysis reveals correlations and a few distinguishing features among the ten malware classes used in this study. A general model comparison shows that the most promising candidates are the light gradient boosting machine, the random forest classifier, and the extra trees classifier. Convolutional networks fail to deliver their outstanding classification ability and are surpassed by a simple, fully connected architecture. Most tests fail to break 80% classification accuracy and present low F1 scores, indicating that more sophisticated approaches (e.g., bootstrapping, random sampling, and feature selection) may be required to maximize performance.
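The recasting step described above, turning raw attack or malware bytes into fixed-size grayscale "images" for a downstream classifier, can be sketched as follows. The byte distributions are synthetic, and a nearest-centroid classifier stands in for the gradient-boosting models in the text:

```python
import numpy as np

rng = np.random.default_rng(2)

def bytes_to_image(buf: bytes, side: int = 32) -> np.ndarray:
    """Pad/truncate a byte buffer and reshape it into a side x side image."""
    arr = np.frombuffer(buf, dtype=np.uint8)[: side * side]
    arr = np.pad(arr, (0, side * side - arr.size))
    return arr.reshape(side, side).astype(np.float32) / 255.0

# Synthetic "families": benign byte values cluster low, malicious cluster high.
benign = [bytes(rng.integers(0, 100, 1024, dtype=np.uint8)) for _ in range(20)]
malicious = [bytes(rng.integers(150, 256, 1024, dtype=np.uint8)) for _ in range(20)]

X = np.stack([bytes_to_image(b).ravel() for b in benign + malicious])
y = np.array([0] * 20 + [1] * 20)

# Minimal stand-in classifier: assign to the nearest class centroid.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(img_flat: np.ndarray) -> int:
    return int(np.argmin(np.linalg.norm(centroids - img_flat, axis=1)))
```

The same flattened image vectors could be fed to any of the models compared above; the point of the sketch is only the bytes-to-image preprocessing that makes computer-vision techniques applicable.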
The Virus-MNIST dataset is a collection of thumbnail images similar in style to the ubiquitous MNIST handwritten digits. However, these images are cast by reshaping possible malware code into image arrays. Naturally, the dataset is poised to take on a role in benchmarking the training of virus classifier models. Ten classes are present: nine labeled as malware and one benign. A cursory review reveals unequal class sizes and other key aspects that must be considered when choosing classification and preprocessing methods. Exploratory analysis shows possible identifiable features derived from aggregate metrics (e.g., the median pixel value), as well as means of reducing the number of features by identifying strong correlations. A model comparison shows that the light gradient boosting machine, the gradient boosting classifier, and the random forest algorithms produce the highest accuracy scores, and thus show promise for deeper scrutiny.
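The correlation-based feature reduction mentioned above can be sketched with a simple greedy filter; this is an illustrative procedure of my choosing, not necessarily the study's exact method:

```python
import numpy as np

def drop_correlated(X: np.ndarray, threshold: float = 0.95) -> np.ndarray:
    """Greedy filter: keep a feature only if its absolute Pearson correlation
    with every already-kept feature is at or below the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return X[:, keep]

# Toy check: the third column is an exact multiple of the first, so it is dropped.
rng = np.random.default_rng(3)
a, b = rng.normal(size=100), rng.normal(size=100)
reduced = drop_correlated(np.column_stack([a, b, 2.0 * a]))
```

On pixel-level data like Virus-MNIST, a filter of this kind prunes near-duplicate columns before model fitting, which is the feature-count reduction the exploratory analysis points toward.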